BONN: Bayesian Optimized Binary Neural Network

FIGURE 3.21

The images on the left are input images chosen from the ImageNet ILSVRC12 dataset. The images on the right are feature maps and binary feature maps from different layers of BONNs. The first and third rows show the feature maps for each group, while the second and fourth rows show the corresponding binary feature maps. Although binarizing the feature maps causes information loss, BONNs can still extract the features essential for accurate classification.

Weight Distribution. Figure 3.23 further illustrates the distribution of the kernel weights with λ fixed to 1e4. During training, the distribution gradually approaches the two-mode GMM assumed earlier, which confirms the effectiveness of the Bayesian kernel loss in a more intuitive way. We also compare the kernel weight distributions of XNOR-Net and BONN. As shown in Fig. 3.24, the kernel weights learned in XNOR-Net are tightly distributed around the threshold value, but those in BONN are regularized in a

[Plot: two panels, "Top-1 on ImageNet" (accuracy axis 10 to 60) and "Top-5 on ImageNet" (accuracy axis 20 to 80). x-axis: Epoch (0 to 70); y-axis: Accuracy. Curves in each panel: BONN-Train, BONN-Test, XNOR-Train, XNOR-Test.]

FIGURE 3.22

Training and test accuracies on ImageNet with λ = 1e4, showing the superiority of the proposed BONN over XNOR-Net. Both networks use a ResNet-18 backbone.